Problem Set N°1

Valentina Andrade

August 27, 2011

Abstract

The following report contains the exercises requested in problem set 1. The first part presents the proofs of some properties and results related to AR, MA and ARMA processes. In the second part, the Box-Jenkins methodology is applied to study three series of the Chilean economy: inflation, the exchange rate and the IPSA. One of the most important results of both exercises concerns how to capture time series structures: whether theoretically or empirically, we can say something that Wold’s theorem had already anticipated: “Any stationary series can be expressed as the sum of two components: a perfectly forecastable series and a moving average of possibly infinite order.”

Part 2

ARMA models have been presented as a parsimonious tool to describe stationary stochastic processes. In theory, a stationary series can be represented by an MA(\(\infty\)), i.e., capturing the entire memory of the series.

In practice this is very expensive, so we will show how an MA(\(\infty\)) can be approximated by an ARMA(\(p,q\)) model with few parameters (i.e. \(p+q\) is small). We follow the methodology of Box and Jenkins to achieve this.

  1. To use ARMA models, the non-stationary components (“trends in the mean” or “trends in the variance”) must be removed. In addition to applying transformations, we run a unit root test (the Dickey-Fuller test).

  2. Other deterministic components are removed. This matters in our case because inflation follows a clear upward path before 2001, a path that is linked to the change in the monetary policy regime (3% rule).

  3. Third, we compute ACF and PACF to identify the order and type of the underlying model.

  4. The model is estimated with the proposed orders p and q.

  5. Diagnostic tests are performed and the adequacy of the identified model is evaluated. In this report we rely mainly on the AIC and the Ljung-Box test.

  6. In-sample predictions of the estimated model are made.
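The idea that a low-order ARMA(\(p,q\)) stands in for an MA(\(\infty\)) can be illustrated with a minimal sketch. It computes the first MA(\(\infty\)) weights \(\psi_k\) implied by an ARMA model through the recursion \(\psi_0 = 1\), \(\psi_k = \theta_k + \sum_{i=1}^{\min(k,p)} \phi_i \psi_{k-i}\). The function name `psi_weights` and the coefficients 0.5 and 0.4 are illustrative assumptions, not estimates from this report.

```python
def psi_weights(phi, theta, n_weights):
    """Return psi_0 .. psi_{n_weights-1} of the MA(infinity) form of an ARMA(p, q).

    phi   : AR coefficients [phi_1, ..., phi_p]
    theta : MA coefficients [theta_1, ..., theta_q]
    """
    psi = [1.0]  # psi_0 is always 1
    for k in range(1, n_weights):
        # theta_k is zero beyond the MA order q
        value = theta[k - 1] if k <= len(theta) else 0.0
        # AR part feeds earlier weights back into the current one
        for i in range(1, min(k, len(phi)) + 1):
            value += phi[i - 1] * psi[k - i]
        psi.append(value)
    return psi


# Hypothetical ARMA(1, 1): two parameters generate an infinite sequence of weights.
weights = psi_weights([0.5], [0.4], 6)
print([round(w, 4) for w in weights])
```

With only two parameters, the weights decay geometrically instead of being estimated one by one, which is the sense in which the ARMA form is parsimonious.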

Data exploration

Figure 1. Series infsv_sa (CPI), IPSA_sa (IPSA), tcn_sa (Exchange Rate CLP/USD) (1990-2022)

Although the series used are seasonally adjusted, as Figure 1 shows, inflation (measured by the consumer price index) presents a clear trend before 2001. This price growth trend stabilized after the Central Bank set an inflation target of around 3% and a policy of nominalization (Fuentes et al, 2003). Similarly, since 2020 the health crisis caused by the COVID-19 pandemic has also been reflected in an increase in the cost of living.

In order to isolate the trends mentioned above, we have chosen to limit the period of analysis from 2001 to 2020, both for inflation and for the other variables of interest, in order to make the models more comparable. We will use the series shown in Figure 2 for the following steps.

Figure 2. Series infsv_sa (Inflation), ipsa_sa (IPSA), tcn_sa (Exchange Rate CLP/USD) (2001-2022)

1. Inflation

First, Figure 3 shows a clear increasing trend in the price level. As mentioned at the beginning, ARMA models require stationary series, but graphically inflation still seems to have a trend component. We use the Dickey-Fuller unit root test to assess whether there is evidence of this trend. Formally,

\[\Delta y_t = \alpha + \phi y_{t-1}+ \varepsilon_t\]

\(H_0: \phi = 0\Rightarrow\) Presence of a stochastic trend in the observations.

\(H_1: \phi <0 \Rightarrow\) No stochastic trend in the observations.
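The test regression above can be sketched in a few lines. This is a simplified illustration, not the procedure behind the tables in this report: it runs plain OLS of the first difference on a constant and the lagged level, without the augmentation lags used by the Augmented Dickey-Fuller test, and it compares the t-statistic with -2.86, the asymptotic 5% critical value for the constant-only case. The simulated series and the function name `dickey_fuller_stat` are assumptions made for the example.

```python
import math
import random


def dickey_fuller_stat(y):
    """OLS of diff(y)_t on a constant and y_{t-1}; returns (phi_hat, t_stat)."""
    x = y[:-1]                                        # lagged level y_{t-1}
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]  # first difference
    n = len(dy)
    x_bar = sum(x) / n
    dy_bar = sum(dy) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (di - dy_bar) for xi, di in zip(x, dy))
    phi_hat = sxy / sxx
    alpha_hat = dy_bar - phi_hat * x_bar
    rss = sum((di - alpha_hat - phi_hat * xi) ** 2 for xi, di in zip(x, dy))
    se_phi = math.sqrt(rss / (n - 2) / sxx)           # standard error of phi_hat
    return phi_hat, phi_hat / se_phi


random.seed(0)
ar1 = [0.0]  # simulated stationary AR(1): y_t = 0.5 * y_{t-1} + e_t
rw = [0.0]   # simulated random walk:      y_t = y_{t-1} + e_t
for _ in range(500):
    ar1.append(0.5 * ar1[-1] + random.gauss(0, 1))
    rw.append(rw[-1] + random.gauss(0, 1))

CRITICAL_5PCT = -2.86  # asymptotic Dickey-Fuller value, regression with constant
for name, series in [("AR(1)", ar1), ("random walk", rw)]:
    phi_hat, t_stat = dickey_fuller_stat(series)
    decision = "reject H0" if t_stat < CRITICAL_5PCT else "fail to reject H0"
    print(name, round(phi_hat, 3), round(t_stat, 2), decision)
```

The t-statistic must be compared with Dickey-Fuller critical values rather than the usual Student-t ones, because under \(H_0\) the regressor is non-stationary.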

Table 1. Dickey-Fuller Test for Inflation series

| Method | p-value | Statistic | Lag order | Alternative | Result (95%) |
| --- | --- | --- | --- | --- | --- |
| Augmented Dickey-Fuller Test | 0.2727477 | -2.720626 | 6 | stationary | Unit root present |

Table 1 shows that, at the 5% significance level, we cannot reject the null hypothesis; that is, we cannot rule out a stochastic trend in this series. Our calculations indicate a trend in the mean, so it can be removed with simple first differencing (if it were a trend in the variance, a logarithmic transformation would be appropriate). After the transformation we plot the series in Figure 4. Table 2 shows that we can now reject the null hypothesis at the 5% level.
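To see why differencing removes a stochastic trend in the mean, consider the simplest illustrative case, a random walk with drift. This is an assumption made for the example, not the estimated process of the inflation series. With \(\varepsilon_t\) white noise of variance \(\sigma^2\):

\[y_t = \alpha + y_{t-1} + \varepsilon_t \quad\Rightarrow\quad \Delta y_t = y_t - y_{t-1} = \alpha + \varepsilon_t\]

\[E[\Delta y_t] = \alpha, \qquad Var(\Delta y_t) = \sigma^2\]

The level \(y_t\) accumulates every past shock, so its variance grows with \(t\), while the differenced series has a constant mean and variance and is therefore stationary.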

Table 2. Dickey-Fuller Test for diff(inflation)

| Method | p-value | Statistic | Lag order | Alternative | Result (95%) |
| --- | --- | --- | --- | --- | --- |
| Augmented Dickey-Fuller Test | 0.01 | -7.911852 | 6 | stationary | I(0), no unit root |

We will now explore the orders of the AR and MA components. On the one hand, the ACF gives us information about the order \(q\) of the MA part. The figure is not very clear about whether this order is 1 or much higher (there are spikes near lag 14). On the other hand, the partial autocorrelation function (PACF) gives us \(p\), the order of the AR(p) part. The figure shows with much more certainty that the process “dies” between lags 3 and 5. The value 5 may be admissible only because of the width of the confidence interval.
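As a minimal sketch of how the two correlograms are obtained, the sample ACF can be computed directly from the data and the PACF derived from it with the Durbin-Levinson recursion. Lags whose values exceed the approximate band \(\pm 1.96/\sqrt{n}\) are the candidates for \(q\) (ACF) and \(p\) (PACF). The function names and the simulated AR(1) series are assumptions for illustration, and statistical packages may use slightly different estimators.

```python
import math
import random


def sample_acf(y, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def sample_pacf(y, max_lag):
    """Partial autocorrelations at lags 1 .. max_lag (Durbin-Levinson)."""
    r = sample_acf(y, max_lag)
    pacf, prev = [], []  # prev holds phi_{k-1,1} .. phi_{k-1,k-1}
    for k in range(1, max_lag + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # update the lower-order coefficients, then append phi_kk
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


random.seed(1)
series = [0.0]  # simulated AR(1) with coefficient 0.6 (illustrative only)
for _ in range(300):
    series.append(0.6 * series[-1] + random.gauss(0, 1))

band = 1.96 / math.sqrt(len(series))  # approximate 95% band under white noise
acf = sample_acf(series, 10)
pacf = sample_pacf(series, 10)
print("ACF lags outside band: ", [k for k in range(1, 11) if abs(acf[k]) > band])
print("PACF lags outside band:", [k for k in range(1, 11) if abs(pacf[k - 1]) > band])
```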

Table 3. ARMA(p,q) Iteration

| Model | sigma | logLik | AIC | BIC | Ljung-Box p-value (residuals) |
| --- | --- | --- | --- | --- | --- |
| ARMA(3, 4) | 0.1266875 | 156.4905 | -294.9810 | -263.6553 | 0.9888481 |
| ARMA(5, 2) | 0.1278088 | 156.1140 | -294.2279 | -262.9022 | 0.9870518 |
| ARMA(5, 3) | 0.1280881 | 156.1609 | -292.3218 | -257.5155 | 0.9719941 |
| ARMA(4, 3) | 0.1287374 | 154.7641 | -291.5282 | -260.2024 | 0.4883086 |
| ARMA(5, 5) | 0.1260530 | 157.6999 | -291.3998 | -249.6321 | 0.8613235 |
| ARMA(5, 4) | 0.1283396 | 156.1747 | -290.3494 | -252.0624 | 0.9737545 |
| ARMA(1, 1) | 0.1310111 | 148.5588 | -289.1176 | -275.1951 | 0.9331472 |
| ARMA(3, 5) | 0.1287390 | 153.7904 | -287.5808 | -252.7744 | 0.9201097 |
| ARMA(2, 1) | 0.1312204 | 148.6825 | -287.3650 | -269.9618 | 0.9902070 |
| ARMA(1, 2) | 0.1312435 | 148.6406 | -287.2812 | -269.8780 | 0.9592657 |
| ARMA(4, 5) | 0.1288373 | 154.1374 | -286.2748 | -247.9878 | 0.8734898 |
| ARMA(2, 2) | 0.1314526 | 148.7664 | -285.5329 | -264.6490 | 0.9784873 |

A function has been created to rank the models by fit, considering the AIC (information criterion), the log-likelihood (logLik), and the Box-Ljung portmanteau test, which tests whether any of a group of residual autocorrelations differs from zero. With this information, the function penalizes ARMA models of higher order \(p+q\). On this basis we select the ARMA(3, 4) model, which has an AIC of -294.9810181. The estimated parameters are:
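The ranking function itself is not reproduced in this report, so the following is only a sketch of the AIC part of the comparison. It uses \(AIC = 2k - 2\,logLik\) with \(k = p + q + 2\) (AR and MA coefficients plus intercept and innovation variance), an assumption that reproduces the AIC column of Table 3. The log-likelihoods are copied from four rows of that table, and the function name `aic` is hypothetical.

```python
def aic(log_lik, p, q):
    """Akaike information criterion for an ARMA(p, q) with intercept."""
    k = p + q + 2  # assumption: p AR + q MA coefficients, intercept, variance
    return 2 * k - 2 * log_lik


# (p, q) -> logLik, taken from Table 3
candidates = {
    (3, 4): 156.4905,
    (5, 2): 156.1140,
    (5, 5): 157.6999,
    (1, 1): 148.5588,
}

# Lower AIC is better: extra parameters must buy enough likelihood to pay for themselves.
ranking = sorted(candidates, key=lambda order: aic(candidates[order], *order))
for p, q in ranking:
    print(f"ARMA({p}, {q})  AIC = {aic(candidates[(p, q)], p, q):.4f}")
```

ARMA(5, 5) has the highest log-likelihood of the four, yet it ranks below ARMA(3, 4) because its three additional parameters cost more than the likelihood they add.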

Table 4. ARMA estimation

| Term | Estimate | Std. error | 2.5 % | 97.5 % |
| --- | --- | --- | --- | --- |
| ar1 | 0.5268825 | 0.0343736 | 0.4595115 | 0.5942536 |
| ar2 | 0.5613215 | 0.0404627 | 0.4820162 | 0.6406269 |
| ar3 | -0.9450220 | 0.0330410 | -1.0097813 | -0.8802627 |
| ma1 | -1.1044121 | 0.0944051 | -1.2894427 | -0.9193815 |
| ma2 | -0.2978542 | 0.0946102 | -0.4832867 | -0.1124217 |
| ma3 | 1.2724179 | 0.0954817 | 1.0852773 | 1.4595586 |
| ma4 | -0.4708067 | 0.0753000 | -0.6183920 | -0.3232214 |
| intercept | -0.0009775 | 0.0037623 | -0.0083514 | 0.0063965 |

The autocorrelation function of the residuals is shown in the ACF plot. The correlogram stays close to zero at all lags, which is consistent with white noise. The residuals therefore show no remaining structure, which suggests that the model is well specified and leaves no information about the series unexplained.

The Ljung-Box test provides a robustness check: it finds no evidence of residual autocorrelation at any lag (see the lag of order 10 in Figure 6 below).
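A minimal sketch of the Ljung-Box statistic, \(Q = n(n+2)\sum_{k=1}^{h} r_k^2/(n-k)\), where \(r_k\) are the residual autocorrelations. To stay within the standard library, the p-value uses the closed form of the chi-square survival function that holds for even degrees of freedom. The sketch takes \(h\) degrees of freedom as a simplifying assumption, whereas for ARMA residuals \(h - p - q\) is commonly used. The residuals here are simulated white noise, not the residuals of the fitted model.

```python
import math
import random


def ljung_box(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1 .. max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total


def chi2_sf_even(x, df):
    """Chi-square survival function P(X > x); exact for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, series_sum = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j       # builds (x/2)^j / j!
        series_sum += term
    return math.exp(-half) * series_sum


random.seed(2)
white_noise = [random.gauss(0, 1) for _ in range(240)]
q_stat = ljung_box(white_noise, 10)
p_value = chi2_sf_even(q_stat, 10)
# A large p-value means no evidence against "all autocorrelations up to lag 10 are zero".
print(round(q_stat, 3), round(p_value, 3))
```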

The last step of Box-Jenkins is prediction. As the figure shows, the values predicted by the ARMA model follow the empirical series very closely.
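In-sample prediction can be sketched as a one-step-ahead recursion: each fitted value combines past observations (AR part) with past one-step errors (MA part), \(\hat{y}_t = \mu + \sum_i \phi_i (y_{t-i} - \mu) + \sum_j \theta_j e_{t-j}\) with \(e_t = y_t - \hat{y}_t\). The sketch assumes that pre-sample values equal the mean and pre-sample errors are zero, which need not match how a statistical package initializes the recursion. The function name, coefficients and data are hypothetical.

```python
def one_step_predictions(y, phi, theta, mu=0.0):
    """One-step-ahead in-sample predictions of an ARMA(p, q) with mean mu."""
    preds, errors = [], []
    for t in range(len(y)):
        # AR part: deviations of past observations from the mean
        ar_part = sum(phi[i] * (y[t - 1 - i] - mu)
                      for i in range(len(phi)) if t - 1 - i >= 0)
        # MA part: past one-step prediction errors
        ma_part = sum(theta[j] * errors[t - 1 - j]
                      for j in range(len(theta)) if t - 1 - j >= 0)
        y_hat = mu + ar_part + ma_part
        preds.append(y_hat)
        errors.append(y[t] - y_hat)
    return preds


# Hypothetical ARMA(1, 1) applied to a short made-up series.
observed = [0.2, -0.1, 0.4, 0.3, -0.2, 0.1]
fitted = one_step_predictions(observed, phi=[0.5], theta=[0.3], mu=0.0)
for obs, fit in zip(observed, fitted):
    print(f"observed {obs:+.2f}  predicted {fit:+.3f}")
```

When the model was fitted to a differenced series, these predictions refer to the differences, and they have to be cumulated to compare them with the original levels.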

2. IPSA

The IPSA series (Chile’s main stock market index) is presented in Figure 8. As can be seen in Table 5, the null hypothesis of a unit root can be rejected at the 5% significance level. Thus, there is no evidence of a stochastic trend in the series.

Table 5. Dickey-Fuller test for IPSA

| Method | p-value | Statistic | Lag order | Alternative | Result (95%) |
| --- | --- | --- | --- | --- | --- |
| Augmented Dickey-Fuller Test | 0.01 | -5.043973 | 6 | stationary | I(0), no unit root |

Regarding the correlograms used to identify the model orders, neither shows a “smooth” decay towards any order; rather, the autocorrelations almost always lie within the confidence interval. They only fall outside the interval at lag 12, which suggests some annual memory in the series. Setting the confidence intervals aside, the ACF and PACF look quite symmetric (the figures are similar). Thus, it is possible that the significant drop occurs after orders p, q > 3.

Table 6 shows the 12 best ARMA(p,q) combinations. As mentioned before, a function ranks them considering, above all, the number of parameters, the AIC and the residual test.

Table 6. ARMA(p,q) models-IPSA

| Model | sigma | logLik | AIC | BIC | Ljung-Box p-value (residuals) |
| --- | --- | --- | --- | --- | --- |
| ARMA(3, 3) | 4.065532 | -678.7555 | 1373.511 | 1401.389 | 0.7484041 |
| ARMA(1, 1) | 4.164760 | -684.2937 | 1376.587 | 1390.527 | 0.4704815 |
| ARMA(4, 2) | 4.119725 | -680.3639 | 1376.728 | 1404.606 | 0.9895842 |
| ARMA(1, 2) | 4.163729 | -683.7363 | 1377.473 | 1394.897 | 0.9721260 |
| ARMA(2, 1) | 4.164388 | -683.7731 | 1377.546 | 1394.970 | 0.9903279 |
| ARMA(5, 5) | 4.062058 | -676.9303 | 1377.861 | 1419.678 | 0.9573418 |
| ARMA(4, 3) | 4.127846 | -680.2603 | 1378.521 | 1409.884 | 0.9887073 |
| ARMA(5, 2) | 4.129013 | -680.3567 | 1378.713 | 1410.077 | 0.9896419 |
| ARMA(1, 3) | 4.170762 | -683.6374 | 1379.275 | 1400.184 | 0.9715892 |
| ARMA(2, 2) | 4.172423 | -683.7272 | 1379.454 | 1400.363 | 0.8881015 |
| ARMA(3, 1) | 4.172472 | -683.7327 | 1379.465 | 1400.374 | 0.9854579 |
| ARMA(4, 1) | 4.163871 | -682.7420 | 1379.484 | 1403.878 | 0.9915723 |

We select the ARMA(3, 3) model, with an AIC of 1373.4120602. The estimated parameters are:

Table 7. ARMA(3,3) Summary

| Term | Estimate | Std. error | 2.5 % | 97.5 % |
| --- | --- | --- | --- | --- |
| ar1 | 0.8712972 | 0.3745375 | 0.1372171 | 1.6053772 |
| ar2 | 0.5023087 | 0.6306094 | -0.7336630 | 1.7382803 |
| ar3 | -0.7528798 | 0.3356288 | -1.4107001 | -0.0950595 |
| ma1 | -0.9396423 | 0.3870535 | -1.6982532 | -0.1810313 |
| ma2 | -0.4191528 | 0.6751012 | -1.7423268 | 0.9040213 |
| ma3 | 0.8107539 | 0.3858066 | 0.0545869 | 1.5669208 |
| intercept | 0.6433242 | 0.3069293 | 0.0417539 | 1.2448946 |

The autocorrelation function of the residuals is shown in the ACF plot. The correlogram stays close to zero at all lags, which is consistent with white noise. The residuals therefore show no remaining structure, which suggests that the model is well specified and leaves no information about the series unexplained.

The Ljung-Box test provides a robustness check: it finds no evidence of residual autocorrelation at any lag (see the lag of order 24 in Figure 9 below).

Unlike the inflation series, the IPSA forecast does not follow the observed values as closely. In particular, the forecasts fail to reproduce the sharp peaks (the kurtosis) of the observed curves.

3. Exchange Rate

Figure 11 shows the exchange rate of the Chilean peso against the US dollar. It shows only one large shock, due to the 2008 crisis, but in general it stays around its mean. Table 8 shows that the unit-root null hypothesis is rejected at the 5% significance level, so there is no evidence of a stochastic trend in this series.

Regarding the orders of the models (Figure 12), these are at least clearer than in the case of IPSA. Here the orders do not seem symmetric; rather, p > q. Both appear to lie between 2 and 3, but neither correlogram is “smoothly decaying”.

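Table 8. Dickey-Fuller test for Exchange Rate

| Method | p-value | Statistic | Lag order | Alternative | Result (95%) |
| --- | --- | --- | --- | --- | --- |
| Augmented Dickey-Fuller Test | 0.01 | -5.594978 | 6 | stationary | I(0), no unit root |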

Table 9 shows the model selection, which confirms what was discussed for the previous figure: the pairs (3,2), (4,2) and (2,3) lose the least information (lowest AIC) and have high p-values in the residual correlation test (for (3,3) it already drops to 0.84).

Table 9. ARMA(p,q) models-Exchange Rate

| Model | sigma | logLik | AIC | BIC | Ljung-Box p-value (residuals) |
| --- | --- | --- | --- | --- | --- |
| ARMA(3, 2) | 2.380310 | -548.1035 | 1110.207 | 1134.600 | 0.9592174 |
| ARMA(4, 2) | 2.375450 | -547.1140 | 1110.228 | 1138.106 | 0.9793663 |
| ARMA(2, 3) | 2.381598 | -548.2437 | 1110.487 | 1134.881 | 0.9573739 |
| ARMA(3, 3) | 2.377868 | -547.3355 | 1110.671 | 1138.549 | 0.8419124 |
| ARMA(2, 5) | 2.375085 | -546.5865 | 1111.173 | 1142.536 | 0.9521570 |
| ARMA(1, 1) | 2.402211 | -551.7083 | 1111.417 | 1125.356 | 0.9867599 |
| ARMA(5, 2) | 2.378433 | -546.8993 | 1111.799 | 1143.162 | 0.9789216 |
| ARMA(2, 2) | 2.393469 | -549.9395 | 1111.879 | 1132.788 | 0.2429308 |
| ARMA(4, 3) | 2.380006 | -547.0524 | 1112.105 | 1143.468 | 0.9945415 |
| ARMA(2, 4) | 2.385934 | -548.1558 | 1112.312 | 1140.190 | 0.9863368 |
| ARMA(1, 2) | 2.403171 | -551.3031 | 1112.606 | 1130.030 | 0.9994660 |
| ARMA(2, 6) | 2.378219 | -546.3872 | 1112.774 | 1147.622 | 0.9952449 |

We select the ARMA(3, 2) model, with an AIC of 1110.2068976. The estimated parameters are:

| Term | Estimate | Std. error | 2.5 % | 97.5 % |
| --- | --- | --- | --- | --- |
| ar1 | 0.4944275 | 0.1949472 | 0.1123380 | 0.8765171 |
| ar2 | -0.8592603 | 0.1104293 | -1.0756978 | -0.6428228 |
| ar3 | 0.1498004 | 0.0822274 | -0.0113624 | 0.3109632 |
| ma1 | -0.2140941 | 0.1762483 | -0.5595344 | 0.1313461 |
| ma2 | 0.8647872 | 0.0940674 | 0.6804185 | 1.0491559 |
| intercept | 0.1829407 | 0.2054703 | -0.2197736 | 0.5856550 |

The autocorrelation function of the residuals is shown in the ACF plot (Figure 13). The correlogram stays close to zero at all lags, which is consistent with white noise. The residuals therefore show no remaining structure, which suggests that the model is well specified and leaves no information about the series unexplained.

The Ljung-Box test provides a robustness check: it finds no evidence of residual autocorrelation at any lag (see the lag of order 10 in Figure 13 below).

Unlike IPSA, the exchange rate model fits the empirical series much better, very similar to what happened with inflation. In fact, this model, the one that uses the fewest parameters, is the one that “follows the series closely”. This tells us that learning from a series does not mean incorporating more variables into the model, but rather depends on how much we understand about the data generating process we are working with. For example, some of the key questions we have asked so far are: are the series stationary? Do they have a trend? Does adding more orders improve the prediction?